DETAILED ACTION 
Response to Amendment 

1. The amendments to the claims have been entered. Claims 1, 2, 15, 17, 19, and 
21 are currently amended and new claims 23-30 have been added. 

Claim Objections 

2. The amendments to the claims overcome the objections made in the previous 
Office Action. The objections to the claims are withdrawn. 

Response to Arguments 

3. Applicant's arguments filed May 23, 2005 have been fully considered but they are 
not persuasive. 

The Applicant has alleged that the subject matter relied upon (Fig. 24, and 
paragraphs 322-323 of Maes) does not appear in either its parent application (U.S. 
Patent Application 09/703,574) or its provisional application (U.S. Patent Application 
60/277,770), therefore Maes (U.S. Patent Application Publication 2002/0184373) does 
not qualify as prior art (see page 15, last paragraph). 

The filing date of Maes (U.S. Patent Application Publication 2002/0184373) is not 
earlier than the instant application's filing date; therefore, the validity of 
the parent application and provisional application is critical for Maes to qualify as prior 
art. The Examiner maintains that the subject matter that was used in the rejections of 
the previous Office Action can be found in either the parent application or provisional 
application. Specifically, the provisional application (U.S. Patent Application 
60/277,770) contains a figure identical (save for element labels) to the relied-upon Fig. 
24 of the previous rejections (see page 1 of the provisional application). Furthermore, 
Fig. 1 of the provisional application has the same GUI browser for receiving and 
displaying displayable content as now required by currently amended claims 15, 17, 19, 
and 21. 

Therefore, the rejections made in the previous Office Action stand. 

4. A copy of the parent application and provisional application listed on the bottom 
of the accompanying Notice of References Cited (PTO-892) is not included with this 
Office Action. Should the applicant desire a copy of such a provisional application, 
applicant should promptly request the copy from the Office of Public Records (OPR) in 
accordance with 37 CFR 1.14(a)(1)(iv), paying the required fee under 37 CFR 
1.19(b)(1). If a copy is ordered from OPR, the shortened statutory period for reply to this 
Office action will not be reset under MPEP § 710.06 unless applicant can demonstrate a 
substantial delay by the Office in fulfilling the order for the copy of the provisional 
application. 

5. Furthermore, with regard to the use of official notice in the rejections of claim 5, it 
is noted that the applicant has not made any attempt to traverse the assertion of official 
notice, therefore the well-known in the art statement is taken to be admitted prior art 
(see MPEP 2144.03). 

Claim Rejections - 35 USC § 102 

6. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

7. Claims 1-4, 6-15, 17-19, and 21-22 are rejected under 35 U.S.C. 102(e) as being 
anticipated by Maes (U.S. Patent Application Publication 2002/0184373). 

In regard to claim 1, Maes discloses a DSR system (Fig. 24) comprising: 

a client (client 3000) to send connection requests, receive displayable content, 
and transmit speech feature data to a server (client 3000 includes a communications 
manager 3021 that provides synchronization protocols 3022 for synchronizing the GUI 
browser by processing event information (sending requests and receiving displayable 
content), page 32, paragraph 322, lines 12-15 and paragraph 324, lines 1-4; and 
transports DSR encoded data to the DSR decoders 3012, page 32, paragraph 323); 

a gateway coupled between the client and the server to support data 
communication between the client and the server (EDGE server 3025 includes a 
gateway 3026 to convert client and server requests across different network protocols, 
page 32, paragraph 325); and 

a server (3004) to receive the speech feature data, perform speech recognition 
on the speech feature data, and transmit displayable content to the client (DSR decoder 
3012 decodes the DSR encoded data and conversational engines 3011 perform speech 
recognition, page 32, paragraph 323 and paragraph 326, lines 1-3; multi-modal shell 
3013 updates the views displayed through GUI browser 3001, page 31, paragraph 321, 
line 21 through page 32, 1st column, lines 1-3). 

In regard to claim 2, Maes discloses: 

client wrapper API to interface with a DSR client browser (wrapper 3003, page 
31, paragraph 321, lines 12-16); 

a DSR frame constructor coupled to the client wrapper API to construct DSR 
frames; 

a DSR payload wrapper coupled to the DSR frame constructor to construct DSR 
payload packets from the DSR frames (communications manager 3021 supports the 
voice coding and transport protocols of 3023, page 32, paragraph 323; the DSR 
transport protocol layer implements RTP and RTCP to packet the speech data, page 
21, paragraph 210, lines 1-4; the RTP packages frame pairs (FP) into DSR payloads, 
see Fig. 29(a) and page 11, paragraph 118, lines 16-23); and 

a DSRML client transceiver to receive displayable content and to send an initial 
connection request to the server (multi-modal shell 3013 updates the views displayed 
through GUI browser 3001, page 31, paragraph 321, line 21 through page 32, 1st 
column, lines 1-3; communications manager 3021 includes synchronization protocols 
3022 that remotely control the conversational speech engines using the SERCP 
protocol, page 32, paragraph 324; the SERCP protocol being a DSR protocol based on 
XML, page 26, paragraph 263, lines 1-3). 

In regard to claim 3, Maes discloses a client transmission/recognition adapter to 
adjust transmission control conditions of the DSR payload wrapper and to control flag 
bits needed for speech recognition according to transmission/recognition parameters; 
and 

said DSR payload wrapper to add flag bits to the DSR payload packets (the RTP 
protocol generates a header that contains various flag bits indicating the size and type 
of the payload frames, see Fig. 7, and page 11, paragraph 118, lines 1-3; and appends 
RTP header to the payload frame pairs, see Fig. 29(a) and page 11, paragraph 119, 
lines 1-8). 

In regard to claim 4, Maes discloses a client protocol stack having a TCP module 
supporting TCP protocol and an IP module supporting IP protocol (edge server 3025 
communicates with the client through the TCP/IP protocol, page 32, paragraph 325, 
lines 1-5). 

In regard to claim 6, Maes discloses a feature compressor coupled to the client 
wrapper API and the DSR frame constructor to compress speech feature data (page 9, 
paragraph 101, lines 1-5). 

In regard to claim 7, Maes discloses: 

a DSR payload de-wrapper to separate DSR speech feature data from 
transmission/recognition parameters; 

a DSR frame extractor coupled to the DSR payload de-wrapper to extract DSR 
frames (DSR decoder 3012 decodes the encoded voice data to provide feature data to 
speech recognizer 3011, page 32, paragraph 323); 

a server wrapper API coupled to the DSR frame extractor to interface with a DSR 
server browser (wrapper 3008, page 31, paragraph 321, lines 12-16); and 

a DSRML server transceiver to send displayable content and to receive an initial 
connection request from the client (multi-modal shell 3013 updates the views displayed 
through GUI browser 3001, page 31, paragraph 321, line 21 through page 32, 1st 
column, lines 1-3; communications manager 3021 includes synchronization protocols 
3022 that remotely control the conversational speech engines using the SERCP 
protocol, page 32, paragraph 324; the SERCP protocol being a DSR protocol based on 
XML, page 26, paragraph 263, lines 1-3). 

In regard to claim 8, Maes discloses a server stack having a UDP module to 
support UDP protocol, the server further including: 

an RTP receiver to receive DSR payload packets using RTP through UDP/IP 
protocol stacks and extracting DSR payload from the DSR payload packets; and 

a server transmission/recognition adapter coupled to the DSR payload 
de-wrapper and the DSR frame extractor to control frame extraction according to 
transmission parameters and flag bits for speech recognition (voice coding and 
transport protocols 3023, page 32, paragraph 323; implement RTP and RTCP to packet 
and control the encoded speech data over the network, page 21, paragraph 210, lines 
1-4; over a UDP layer (see Fig. 20, 2007), page 20, paragraph 203, lines 17-18; the 
server must inherently include the necessary extractors and de-wrappers for decoding 
the speech feature data). 

In regard to claim 9, Maes discloses a feature compressor on the client side 
compresses the feature data before transmission to the server (Fig. 2b, 214). 

In regard to claims 10 and 11, Maes discloses the gateway supports both wired 
and wireless data communication (3G wireless network and H.323 wired network, page 
20, paragraph 203, lines 5-11). 

In regard to claim 12, Maes discloses a web server coupled to the server via a 
network (content server 3014, page 31, paragraph 320, lines 9-11). 

In regard to claim 13, Maes discloses a front-end engine for reducing noise and 
to extract the speech feature data (page 22, paragraph 228). 

Application/Control Number: 10/057,161 
Art Unit: 2655 

In regard to claim 14, Maes discloses the displayable content is represented as a 
DSRML document (multi-modal shell 3013 updates the views displayed through GUI 
browser 3001, page 31, paragraph 321, line 21 through page 32, 1st column, lines 1-3; 
communications manager 3021 includes synchronization protocols 3022 that remotely 
control the conversational speech engines using the SERCP protocol, page 32, 
paragraph 324; the SERCP protocol being a DSR protocol based on XML, page 26, 
paragraph 263, lines 1-3). 

In regard to claims 15 and 19, Maes discloses a method and machine readable 
medium having executable code stored thereon to perform the method, comprising: 

receiving displayable content associated with a markup language document 
(client 3000 includes a communications manager 3021 that provides synchronization 
protocols 3022 for synchronizing the GUI browser by processing event information 
(sending requests and receiving displayable content), page 32, paragraph 322, lines 12- 
15 and paragraph 324, lines 1-4; and transports DSR encoded data to the DSR 
decoders 3012, page 32, paragraph 323; the GUI information being sent through the 
DOM 3002 and 3007 via standard markup language documents, page 31, paragraph 
321, lines 1-9); 

receiving input speech data; 

extracting speech features from the input speech data; 

packaging the speech features into DSR frames in a DSR frame format (Fig. 2a, 
MFCC's are extracted from input speech, page 9, paragraph 100; MFCC features are a 
framed format); 

collecting DSR frames to form a DSR payload (page 10, paragraph 111, lines 1-4); 
and 

transmitting the DSR payload to a server for speech recognition processing 
(page 10, paragraph 110, lines 11-13). 

In regard to claims 17 and 21, Maes discloses a method and machine readable 
medium having executable code stored thereon to perform the method, comprising: 

sending displayable content associated with a markup language document (on 
the server side, synchronization manager 3009 synchronizes the GUI user interface 
event information, page 32 paragraph 322, lines 15-17; the GUI information being sent 
through the DOM 3002 and 3007 via standard markup language documents, page 31, 
paragraph 321, lines 1-9); 

receiving a DSR payload packet; 

de-wrapping DSR payload from the DSR payload packet and separating DSR 
speech feature data from transmission/recognition parameters; 

extracting DSR frames from the DSR payload (the DSR decoder must inherently 
have the necessary extractors and de-wrappers to recover the speech feature data); 

extracting speech feature data from the DSR frames (Fig. 2b, 213 and page 9, 
paragraph 104, lines 7-10); and 

sending the speech feature data to a speech recognition engine and for 
recognition (Fig. 2b, 212, page 9, paragraph 104, lines 5-7). 

In regard to claims 18 and 22, Maes discloses de-compressing the speech 
feature data (decompression module 214, page 9, paragraph 104, lines 7-9). 

In regard to claim 23, Maes discloses a client for a distributed speech recognition 
system (Fig. 24, client), comprising: 

an engine to extract speech features of a speech input (client 3000 includes a 
communications manager 3021 that provides synchronization protocols 3022 for 
synchronizing the GUI browser by processing event information (sending requests and 
receiving displayable content), page 32, paragraph 322, lines 12-15 and paragraph 324, 
lines 1-4; and transports DSR encoded data to the DSR decoders 3012, page 32, 
paragraph 323); 

a frame constructor to generate frames comprising extracted speech features 
(Fig. 2a, MFCC's are extracted from input speech, page 9, paragraph 100; MFCC 
features are a framed format); 

a payload wrapper to construct payload packets from the frames comprising 
extracted speech features (page 10, paragraph 111, lines 1-4); and 

a transceiver to receive displayable content and to send an initial connection 
request to a server (client 3000 includes a communications manager 3021 that provides 
synchronization protocols 3022 for synchronizing the GUI browser by processing event 
information (sending requests and receiving displayable content), page 32, paragraph 
322, lines 12-15 and paragraph 324, lines 1-4; and transports DSR encoded data to the 
DSR decoders 3012, page 32, paragraph 323). 

In regard to claim 24, Maes discloses an adapter to adjust transmission control 
conditions of the payload wrapper according to the transmission/recognition 
parameters, wherein 

the payload wrapper is to add flag bits to the payload packets based upon 
transmission/recognition parameters received from the adapter (the RTP protocol 
generates a header that contains various flag bits indicating the size and type of the 
payload frames, see Fig. 7, and page 11, paragraph 118, lines 1-3; and appends RTP 
header to the payload frame pairs, see Fig. 29(a) and page 11, paragraph 119, lines 1- 
8). 

In regard to claim 26, Maes discloses a TCP stack (edge server 3025 
communicates with the client through the TCP/IP protocol, page 32, paragraph 325, 
lines 1-5); 

a UDP stack (see Fig. 20, 2007, page 20, paragraph 203, lines 17-18); 

an adapter to adjust transmission control conditions of the payload wrapper 
based upon transmission/recognition parameters (the RTP protocol generates a header 
that contains various flag bits indicating the size and type of the payload frames, see 
Fig. 7, and page 11, paragraph 118, lines 1-3; and appends RTP header to the payload 
frame pairs, see Fig. 29(a) and page 11, paragraph 119, lines 1-8); wherein 

said payload wrapper is to select between sending the payload packets to the 
server via the TCP stack and sending the payload packets to the server via RTP and 
the UDP stack based upon transmission/recognition parameters of the adapter (either 
TCP or UDP can be selected, page 20, paragraph 203, lines 17-19). 

In regard to claim 27, Maes discloses a compressor to provide the frame 
constructor with compressed speech features for frames (page 9, paragraph 101, lines 
1-5). 

In regard to claim 28, Maes discloses a server for a distributed speech 
recognition system (Fig. 24, server), comprising: 

a payload de-wrapper to obtain transmission parameters and flag bits from 
payload packets received from a client; 

a frame extractor to extract frames comprising speech features from the received 
payload packets (DSR decoder 3012 decodes the encoded voice data to provide 
feature data to speech recognizer 3011, page 32, paragraph 323); 

a speech recognition engine to recognize speech from the speech features of the 
extracted frames (engines 3011 perform speech recognition, page 32, paragraph 326, 
lines 1-3); and 

a transceiver to receive an initial connection request from the client and to send 
the client displayable content based upon the recognized speech (multi-modal shell 
3013 updates the views displayed through GUI browser 3001, page 31, paragraph 321, 
line 21 through page 32, 1st column, lines 1-3; communications manager 3021 includes 
synchronization protocols 3022 that remotely control the conversational speech engines 
using the SERCP protocol, page 32, paragraph 324; the SERCP protocol being a DSR 
protocol based on XML, page 26, paragraph 263, lines 1-3). 

In regard to claim 29, Maes discloses: 

a RTP receiver to receive payload packets; and 

an adapter to control frame extraction of the frame extractor according to the 
transmission parameters and flag bits obtained by the payload de-wrapper (voice 
coding and transport protocols 3023, page 32, paragraph 323; implement RTP and 
RTCP to packet and control the encoded speech data over the network, page 21, 
paragraph 210, lines 1-4; over a UDP layer (see Fig. 20, 2007), page 20, paragraph 
203, lines 17-18; the server must inherently include the necessary extractors and 
de-wrappers for decoding the speech feature data). 

In regard to claim 30, Maes discloses a frame de-compressor to de-compress 
speech features of the extracted speech frames (decompression module 214, page 9, 
paragraph 104, lines 7-9). 

Claim Rejections - 35 USC § 103 

8. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

9. Claims 5 and 25 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Maes. 

Maes discloses that the voice coding and transport protocols (3023, page 32, 
paragraph 323) implement RTP and RTCP to packet and control the encoded speech 
data over the network (page 21, paragraph 210, lines 1-4) over a UDP layer (see Fig. 
20, 2007, page 20, paragraph 203, lines 17-18). 

Maes does not specifically disclose that the RTP sender includes a buffer to 
store packets that have been sent, but not acknowledged by the server, and 
retransmitting the RTP packets until they are acknowledged by the server. 

The Applicant's admitted prior art discloses including a buffer in an RTP 
implementation over UDP, since UDP does not include any assurance that packets 
have reached their destination. Including a buffer to store the RTP packets until they are 
acknowledged as being received by an RTCP packet ensures that every packet 
reaches the server. This would be especially important in a DSR system, since any loss 
of speech feature packets would result in extreme recognition errors. 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Maes to include an RTP buffer, in order to ensure that every packet 
reached the server, thereby ensuring the best recognition results possible. 
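
By way of illustration only, the buffering arrangement discussed above may be sketched as follows. The sketch assumes that each packet carries a sequence number and that acknowledgements identify packets by that number; the class name `RetransmitBuffer`, its methods, and the `transmit` callable are hypothetical and do not appear in Maes or in the Applicant's disclosure.

```python
from collections import OrderedDict


class RetransmitBuffer:
    """Holds sent packets until the receiver acknowledges them (illustrative only)."""

    def __init__(self, transmit):
        # transmit: hypothetical callable that places (seq, payload) on the wire,
        # for example over an unreliable UDP transport.
        self._transmit = transmit
        self._unacked = OrderedDict()  # sequence number -> payload

    def send(self, seq, payload):
        # Keep a copy of the packet before sending, since delivery is not assured.
        self._unacked[seq] = payload
        self._transmit(seq, payload)

    def acknowledge(self, seq):
        # Drop a packet once an acknowledgement for it has arrived.
        self._unacked.pop(seq, None)

    def retransmit_pending(self):
        # Resend every packet not yet acknowledged, oldest first.
        for seq, payload in list(self._unacked.items()):
            self._transmit(seq, payload)

    def pending(self):
        # Sequence numbers still awaiting acknowledgement.
        return list(self._unacked)
```

In this sketch a packet leaves the buffer only upon acknowledgement, so a call to `retransmit_pending()` resends exactly those packets whose receipt has not been confirmed.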

10. Claims 16 and 20 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Maes, as applied to claims 15 and 19, above, in view of Allman et al. (Increasing 
TCP's Initial Window), in further view of Nishimura et al. (U.S. Patent 6,754,200), and 
further in view of Mathis et al. (TCP Selective Acknowledgement Options). 

Maes discloses passing the DSR payload to a transport protocol stack composed 
of TCP and IP (client and server communicate through the TCP/IP protocol, page 32, 
paragraph 325). 

Maes does not disclose: 

increasing a TCP initial window; 

Allman et al. disclose that increasing a TCP initial window reduces the transmission 
time for connections transmitting only a small amount of data (page 3, section 3, second 
paragraph). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Maes to increase the TCP initial window, in order to reduce the 
transmission time of the speech features (which are a small amount of data), so that the 
overall latency of the system was reduced, providing a faster response time to a user's 
spoken commands. 

Neither Maes nor Allman et al. disclose: 

adopting no slow-start restart; 

Nishimura et al. (U.S. Patent 6,754,200) disclose a system that adopts a no 
slow-start restart when a discarded packet is not congestion related (column 9, lines 42-46). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Maes and Allman et al. to adopt a no 
slow-start restart so that there would be no waiting time for retransmission if a packet 
was lost, as taught by Nishimura et al. (column 9, lines 56-57). Again, this would 
reduce the overall latency of the system, providing a faster response time to a user's 
spoken commands. 

Neither Maes nor Allman et al. nor Nishimura et al. disclose applying a TCP SACK. 

Mathis et al. disclose that multiple packet losses have a catastrophic effect on TCP 
throughput because they cause the client (sender) to retransmit segments which have 
already been correctly received (page 2, first paragraph, lines 1-7). SACK corrects that 
by ensuring that the client (sender) only needs to retransmit packets that have actually 
been lost (page 2, second paragraph). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Maes, Allman et al., and Nishimura et al. 
to apply TCP SACK to increase the overall throughput of the system in situations where 
there were multiple packet losses, again reducing the overall latency of the system and 
providing a faster response time to a user's spoken commands. 
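
By way of illustration only, the selective retransmission discussed above may be sketched as follows. The sketch is a simplification: it numbers whole segments rather than bytes, and the function name `segments_to_retransmit` and its parameters are hypothetical and are not taken from Mathis et al. or any other reference of record.

```python
def segments_to_retransmit(sent, cumulative_ack, sack_blocks):
    """Return the sent segment numbers the receiver has not reported as received.

    sent: iterable of segment numbers already transmitted
    cumulative_ack: every segment number below this value has been received
    sack_blocks: list of (start, end) ranges, end exclusive, received out of order
    """
    missing = []
    for seg in sorted(sent):
        if seg < cumulative_ack:
            continue  # covered by the cumulative acknowledgement
        if any(start <= seg < end for start, end in sack_blocks):
            continue  # covered by a selective acknowledgement block
        missing.append(seg)
    return missing
```

Under these assumptions, with segments 1 through 8 sent, a cumulative acknowledgement of 3, and selective acknowledgement blocks (4, 6) and (7, 8), only segments 3, 6, and 8 are retransmitted, rather than every segment from 3 onward.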

Conclusion 

11. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

12. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Brian L. Albertalli whose telephone number is 
(571) 272-7616. The examiner can normally be reached on Mon - Fri, 8:00 AM - 5:30 PM, 
every second Fri off. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Wayne Young, can be reached on (571) 272-7582. The fax phone number 
for the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 